Abstract

We establish an information inequality that is intimately connected to the evaluation of the
sum rate given by Marton's inner bound for two-receiver broadcast channels with a binary input
alphabet. This generalizes a recent result where the inequality was established for a particular
channel, the binary skew-symmetric broadcast channel. The inequality implies that the randomized
time-division strategy indeed achieves the sum rate of Marton's inner bound for all binary-input
broadcast channels.

1 Introduction

A two-receiver broadcast channel models the communication scenario where two (independent)
messages are to be transmitted from a sender $X$ to two receivers $Y$, $Z$. Each receiver is interested in
decoding his/her message. A transition probability matrix $p(y, z|x)$ models the stochastic
nature of the errors introduced during the communication. For formal definitions and early results
the reader can refer to [1], [2].

1.1 Background

The following region obtained by Marton[3] represents the best-known achievable region to-date: 



Bound 1. The set of rate pairs $(R_1, R_2)$ satisfying the following constraints:

\[
\begin{aligned}
R_1 &\le I(U, W; Y) \\
R_2 &\le I(V, W; Z) \\
R_1 + R_2 &\le \min\{I(W;Y), I(W;Z)\} + I(U;Y|W) + I(V;Z|W) - I(U;V|W)
\end{aligned}
\]

for any set of random variables $(U, V, W)$ such that $(U, V, W) \to X \to (Y, Z)$ forms a Markov chain
are achievable.

Recently Gohari and Anantharam [4] used a remarkable perturbation-based argument to establish
that it suffices to consider $(U, V, W)$ with alphabet sizes bounded by $|\mathcal{U}| \le |\mathcal{X}|$, $|\mathcal{V}| \le |\mathcal{X}|$,
$|\mathcal{W}| \le |\mathcal{X}| + 4$ to compute the extreme points of Bound 1. In general the computation of Marton's inner
bound is difficult, and prior to [4] this bound was not strictly evaluatable. Even with these bounds
on cardinalities, explicit evaluation of the bound is still a difficult task.

The following region represents an outer-bound to the capacity region of the broadcast channel. 






Bound 2. [5] The union of rate pairs $(R_1, R_2)$ satisfying the following constraints:

\[
\begin{aligned}
R_1 &\le I(U;Y) \\
R_2 &\le I(V;Z) \\
R_1 + R_2 &\le I(U;Y) + I(V;Z|U) \\
R_1 + R_2 &\le I(V;Z) + I(U;Y|V)
\end{aligned}
\]

over all pairs of random variables $(U, V)$ such that $(U, V) \to X \to (Y, Z)$ forms a Markov chain
forms an outer bound to the capacity region of the broadcast channel.

The capacity regions of special classes of broadcast channels have been established, and in every
case it turns out that Bounds 1 and 2 agree. In order to study whether Bounds 1 and 2
are indeed different or whether they are different representations of the same region, the authors
of [6] studied a particular channel called the binary skew-symmetric broadcast channel (BSSC). The
authors conjectured that for the BSSC the following inequality holds:

\[ I(U;Y) + I(V;Z) - I(U;V) \le \max\{I(X;Y), I(X;Z)\}. \tag{1} \]

The authors further showed that, assuming (1) holds, Bounds 1 and 2 differ for the BSSC.

In [4], the authors established that Bounds 1 and 2 are indeed different for the BSSC without
actually establishing that (1) is true. They verified that (1) is plausible by confirming
it for a large number of randomly generated samples from the cardinality-constrained space.

In [7] the validity of inequality (1) was established rigorously using a modification of the
perturbation-based arguments of [4]. The authors of [7] also established that, in order to compute
the maximum sum-rate for Marton's inner bound, it suffices to consider $|\mathcal{W}| \le |\mathcal{X}|$, $|\mathcal{U}| \le |\mathcal{X}|$,
$|\mathcal{V}| \le |\mathcal{X}|$, a mild improvement over the results of [4] for the sum-rate computation. This result
also quantifies the gap between the sum-rate estimates given by the inner and outer bounds for the
BSSC.

1.2 Summary of results

The main result of the paper is the following:

Theorem 1. Consider a five-tuple of random variables $(U, V, X, Y, Z)$ such that $(U, V) \to X \to (Y, Z)$
forms a Markov chain, and further let $|\mathcal{X}| = 2$. Then the following inequality holds:

\[ I(U;Y) + I(V;Z) - I(U;V) \le \max\{I(X;Y), I(X;Z)\}. \tag{2} \]

This extends (1) to every binary-input broadcast channel. Combining this result
with the cardinality bounds for the sum-rate obtained in [7], we also establish that the maximum
sum rate given by Marton's coding strategy matches that given by the randomized time-division
strategy [5], a much simpler achievable strategy, for any binary-input broadcast channel.

Corollary 1. The maximum value of the sum-rate for Marton's inner bound for any binary-input
broadcast channel is given by

\[ \max_{p(w,x)} \; \min\{I(W;Y), I(W;Z)\} + P(W=0)\, I(X;Y|W=0) + P(W=1)\, I(X;Z|W=1), \]

where $|\mathcal{W}| = 2$.
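Inequality (2) can be probed numerically. The following is a minimal sketch, not part of the original argument: it samples random binary-input channels and random distributions on binary $(U, V)$ with $X = U \wedge V$ or $X = U \oplus V$, and reports the smallest observed value of RHS minus LHS. The helper names (`mutual_information`, `inequality_gap`, `smallest_gap`) and the sampling scheme are assumptions made for illustration only.

```python
import math
import random


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def random_dist(n, rng):
    """A random probability vector of length n with strictly positive entries."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]


def accumulate(joint, key, p):
    joint[key] = joint.get(key, 0.0) + p


def inequality_gap(p_uv, f, chan_y, chan_z):
    """RHS - LHS of inequality (2) when X = f(U, V).

    p_uv:   dict {(u, v): probability}
    chan_y: chan_y[x][y] = P(Y = y | X = x); chan_z is analogous for Z.
    """
    uy, vz, xy, xz = {}, {}, {}, {}
    for (u, v), p in p_uv.items():
        x = f(u, v)
        for y, q in enumerate(chan_y[x]):
            accumulate(uy, (u, y), p * q)
            accumulate(xy, (x, y), p * q)
        for z, q in enumerate(chan_z[x]):
            accumulate(vz, (v, z), p * q)
            accumulate(xz, (x, z), p * q)
    lhs = (mutual_information(uy) + mutual_information(vz)
           - mutual_information(p_uv))
    rhs = max(mutual_information(xy), mutual_information(xz))
    return rhs - lhs


def smallest_gap(trials=2000, n_out=3, seed=0):
    """Smallest RHS - LHS seen over random channels and input distributions."""
    rng = random.Random(seed)
    functions = [lambda u, v: u & v, lambda u, v: u ^ v]  # "and", "xor"
    worst = float("inf")
    for _ in range(trials):
        probs = random_dist(4, rng)
        p_uv = {(u, v): probs[2 * u + v] for u in (0, 1) for v in (0, 1)}
        chan_y = [random_dist(n_out, rng) for _ in range(2)]
        chan_z = [random_dist(n_out, rng) for _ in range(2)]
        for f in functions:
            worst = min(worst, inequality_gap(p_uv, f, chan_y, chan_z))
    return worst


if __name__ == "__main__":
    print("smallest observed gap (bits):", smallest_gap())
```

A non-negative result (up to floating-point error) is consistent with Theorem 1; random sampling of this kind is of course no substitute for the proof.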
1.2.1 Randomized time-division strategy

The randomized time-division (R-TD) strategy [2] corresponds to an achievable strategy for the following
setting of $(U, V, W)$ in Bound 1: $W = 0$ implies that $U = X$, $V = \emptyset$; and $W = 1$ implies that
$V = X$, $U = \emptyset$ (where $\emptyset$ refers to the trivial random variable). Observe that this corresponds to a
time-division strategy, except that the choice of slots in which communication occurs to each receiver is
itself drawn from a codebook, which conveys additional information.
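To make the R-TD sum-rate concrete, here is a small sketch that evaluates $\min\{I(W;Y), I(W;Z)\} + P(W=0) I(X;Y|W=0) + P(W=1) I(X;Z|W=1)$ on a coarse grid over binary $p(w, x)$. The function names, the grid search, and the example channel are assumptions for illustration; in particular the BSSC transition matrices below follow the usual definition of that channel, which this paper does not restate.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def input_output_mi(p1, chan):
    """I(X;Y) when P(X = 1) = p1 and chan[x][y] = P(Y = y | X = x)."""
    px = (1.0 - p1, p1)
    joint = {(x, y): px[x] * q for x in (0, 1) for y, q in enumerate(chan[x])}
    return mutual_information(joint)


def time_sharing_mi(t, alpha, beta, chan):
    """I(W;Y) for P(W=0) = t, P(X=1|W=0) = alpha, P(X=1|W=1) = beta."""
    pw, p1 = (t, 1.0 - t), (alpha, beta)
    joint = {(w, y): pw[w] * ((1.0 - p1[w]) * chan[0][y] + p1[w] * chan[1][y])
             for w in (0, 1) for y in range(len(chan[0]))}
    return mutual_information(joint)


def rtd_sum_rate(t, alpha, beta, chan_y, chan_z):
    """R-TD sum-rate: W = 0 slots serve receiver Y, W = 1 slots serve Z."""
    common = min(time_sharing_mi(t, alpha, beta, chan_y),
                 time_sharing_mi(t, alpha, beta, chan_z))
    return (common
            + t * input_output_mi(alpha, chan_y)
            + (1.0 - t) * input_output_mi(beta, chan_z))


def grid_search(chan_y, chan_z, steps=20):
    """Coarse maximisation over (t, alpha, beta) on a uniform grid."""
    grid = [k / steps for k in range(steps + 1)]
    return max(rtd_sum_rate(t, a, b, chan_y, chan_z)
               for t in grid for a in grid for b in grid)


# Assumed example: the BSSC in its usual form (not defined in this paper).
BSSC_Y = [[1.0, 0.0], [0.5, 0.5]]   # P(Y = y | X = x)
BSSC_Z = [[0.5, 0.5], [0.0, 1.0]]   # P(Z = z | X = x)

if __name__ == "__main__":
    print("R-TD sum-rate on the grid (bits):", grid_search(BSSC_Y, BSSC_Z))
```

Setting $t = 1$ or $t = 0$ recovers plain single-user transmission, so the grid maximum is never below the best single-user rate on the same grid; a finer grid or a proper optimizer would be needed for accurate values.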

1.2.2 Relationship between Theorem 1 and $\Gamma_5^*$

Recently there has been a lot of interest in information inequalities and the study of the structure
of the entropic space $\Gamma_n^*$. Theorem 1 refers to a subset, $S$, of points in $\Gamma_5^*$: those corresponding
to a five-tuple of random variables $(U, V, X, Y, Z)$ such that $(U, V) \to X \to (Y, Z)$ forms a Markov
chain and with a binary constraint on the cardinality of $X$, i.e. $|\mathcal{X}| = 2$. It shows that the points
in $S$ have to lie in the union of the two half-spaces

\[
\begin{aligned}
I(U;Y) + I(V;Z) - I(U;V) &\le I(X;Y), \\
I(U;Y) + I(V;Z) - I(U;V) &\le I(X;Z).
\end{aligned}
\]

Since the inequalities are tight, $S$ is not a convex region in general. The non-convexity of the
region also gives a heuristic reason why Shannon-type inequalities may not be sufficient to
establish Theorem 1.

Before we go into the proof, we show how Corollary 1 follows from Theorem 1.

Recall from [7] that, to compute the maximum of

\[ \lambda I(W;Y) + (1-\lambda) I(W;Z) + I(U;Y|W) + I(V;Z|W) - I(U;V|W), \quad 0 \le \lambda \le 1, \]

over all choices of $(U, V, W) \to X \to (Y, Z)$, it suffices to restrict to $|\mathcal{W}| = |\mathcal{X}|$.

Hence, to evaluate Marton's sum-rate for a binary-input broadcast channel it
suffices to look at $|\mathcal{W}| = 2$.

Thus we need to show that the maximum sum-rate $R_{TD}$ obtained by the randomized time-division
strategy matches the maximum sum-rate $R_M$ given by Marton's inner bound.

Proof. Clearly, we have $R_M \ge R_{TD}$, as R-TD is a restriction of the choice of $U, V, W$.

Consider a $(U, V, W)$ that achieves the maximum sum-rate $R_M$. We consider two cases:

Case 1: Either

\[ I(X;Y|W=0) \ge I(X;Z|W=0) \text{ and } I(X;Y|W=1) \ge I(X;Z|W=1), \]

or

\[ I(X;Z|W=0) \ge I(X;Y|W=0) \text{ and } I(X;Z|W=1) \ge I(X;Y|W=1). \]

W.l.o.g. say the former holds, i.e.

\[ I(X;Y|W=0) \ge I(X;Z|W=0) \text{ and } I(X;Y|W=1) \ge I(X;Z|W=1). \tag{3} \]
Clearly, the Marton sum-rate $R_M$ satisfies

\[
\begin{aligned}
R_M &= \min\{I(W;Y), I(W;Z)\} + I(U;Y|W) + I(V;Z|W) - I(U;V|W) \\
&= \min\{I(W;Y), I(W;Z)\} + P(W=0)\big(I(U;Y|W=0) + I(V;Z|W=0) - I(U;V|W=0)\big) \\
&\quad + P(W=1)\big(I(U;Y|W=1) + I(V;Z|W=1) - I(U;V|W=1)\big) \\
&\overset{(a)}{\le} \min\{I(W;Y), I(W;Z)\} + P(W=0)\, I(X;Y|W=0) + P(W=1)\, I(X;Y|W=1) \\
&\le \min\{I(W;Y), I(W;Z)\} + I(X;Y|W) \le I(X;Y) \le R_{TD},
\end{aligned}
\]

where (a) follows from Theorem 1 and (3), the bound by $I(X;Y)$ uses $I(W;Y) + I(X;Y|W) = I(X;Y)$
for the Markov chain $W \to X \to Y$, and $R_{TD}$ denotes the maximum sum-rate of the randomized
time-division strategy.

Case 2:

\[ I(X;Y|W=0) \ge I(X;Z|W=0) \text{ and } I(X;Z|W=1) \ge I(X;Y|W=1). \tag{4} \]

Observe that

\[
\begin{aligned}
R_M &= \min\{I(W;Y), I(W;Z)\} + I(U;Y|W) + I(V;Z|W) - I(U;V|W) \\
&= \min\{I(W;Y), I(W;Z)\} + P(W=0)\big(I(U;Y|W=0) + I(V;Z|W=0) - I(U;V|W=0)\big) \\
&\quad + P(W=1)\big(I(U;Y|W=1) + I(V;Z|W=1) - I(U;V|W=1)\big) \\
&\overset{(a)}{\le} \min\{I(W;Y), I(W;Z)\} + P(W=0)\, I(X;Y|W=0) + P(W=1)\, I(X;Z|W=1) \le R_{TD},
\end{aligned}
\]

where (a) follows from Theorem 1 and (4).

This implies $R_M \le R_{TD}$, and thus we complete the proof of Corollary 1. □

3 Proof of Theorem 1

The idea of the proof is to fix a $p(y, z|x)$ (i.e. a particular broadcast channel) and show that for all
$p_0(x)$ we have

\[ \max_{p(u,v,x):\, p(x) = p_0(x)} I(U;Y) + I(V;Z) - I(U;V) \le \max\{I(X;Y), I(X;Z)\}. \]

Denote by LHS and RHS the left-hand side and right-hand side of inequality (2), respectively.
Let $p_{uv} = P(U = u, V = v)$. We also use the following notation: $U \wedge V$ (and), $U \vee V$ (or),
$U \oplus V$ (xor), $\bar{U}$ (not).

Remark 1. From [4] (or see Fact 1 and Claim 1 in [7] for a self-contained shorter proof) it suffices
to establish Theorem 1 for the scenario $|\mathcal{U}| \le |\mathcal{X}|$, $|\mathcal{V}| \le |\mathcal{X}|$ and $X = f(U, V)$, a deterministic
function of $(U, V)$.

The outline of the proof is as follows:

1. We first prove the inequality for some special settings, or "trivial" cases. (Section 3.1)

2. We show that it suffices to prove the inequality for the nontrivial cases $X = U \wedge V$ and $X = U \oplus V$.
(Section 3.2)

3. For $X = U \wedge V$, we show that a nontrivial maximum of LHS can only be achieved when
at least two of $\{p_{00}, p_{01}, p_{10}\}$ equal zero. This reduces the setting to one of the trivial cases.
(Section 3.3)

4. For $X = U \oplus V$, we show that a nontrivial maximum of LHS can only be achieved when
at least one of the $p_{uv}$ equals zero, which reduces to the case $X = U \wedge V$. (Section 3.4)

For a binary-input channel $X \to Y$, let $\{a_i, \bar{a}_i\}$ denote the transition probabilities, where

\[ P(Y = i|X = 0) = a_i, \quad P(Y = i|X = 1) = \bar{a}_i, \quad i = 1, \dots, N. \]

Similarly let

\[ P(Z = i|X = 0) = b_i, \quad P(Z = i|X = 1) = \bar{b}_i, \quad i = 1, \dots, N. \]

Remark 2. W.l.o.g. we can assume that all the terms $\{a_i, \bar{a}_i, b_i, \bar{b}_i\}$ are non-zero (i.e.
positive). The validity of the inequality at boundary points, i.e. when some of $\{a_i, \bar{a}_i, b_i, \bar{b}_i\}$ are zero,
follows from the continuity of mutual information.

Notation. $X \perp Y$: $X$ and $Y$ are independent.

Since $U \to X \to Y$ and $V \to X \to Z$ are Markov chains, from the data processing inequality we
know

\[
\begin{aligned}
I(U;Y) &\le I(X;Y), & I(U;Y) &\le I(U;X), \\
I(V;Z) &\le I(X;Z), & I(V;Z) &\le I(V;X).
\end{aligned}
\tag{5}
\]

With these inequalities, we first prove Theorem 1 for some special settings.

3.1 Proof for Special Settings 

SS1. $a_i = \bar{a}_i$ for all $i$. Then $X \perp Y$, and thus $I(U;Y) = I(X;Y) = 0$. Thus from (5) and the
non-negativity of $I(U;V)$ we have $I(V;Z) - I(V;U) \le I(X;Z)$, i.e. Theorem 1 holds. Similarly
Theorem 1 holds when $b_i = \bar{b}_i$ for all $i$.

SS2. $U \perp X$. Then $I(U;Y) = I(U;X) = 0$. Again from (5) and the non-negativity of $I(U;V)$,
Theorem 1 holds. Similarly, when $V \perp X$, Theorem 1 also holds.
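The data processing inequalities in (5) and the special setting SS1 are easy to check numerically. The sketch below is illustrative only: it builds a binary Markov chain $U \to X \to Y$ from assumed parameters (`p_u1`, `p_x1_given_u`, and a channel matrix `chan`, all hypothetical names) and returns the three mutual informations involved.

```python
import math
import random


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def chain_informations(p_u1, p_x1_given_u, chan):
    """Return (I(U;Y), I(U;X), I(X;Y)) for a binary Markov chain U -> X -> Y.

    p_u1:         P(U = 1)
    p_x1_given_u: pair (P(X = 1 | U = 0), P(X = 1 | U = 1))
    chan:         chan[x][y] = P(Y = y | X = x)
    """
    pu = (1.0 - p_u1, p_u1)
    ux, uy, xy = {}, {}, {}
    for u in (0, 1):
        for x in (0, 1):
            px_given_u = p_x1_given_u[u] if x == 1 else 1.0 - p_x1_given_u[u]
            p = pu[u] * px_given_u
            ux[(u, x)] = p
            for y, q in enumerate(chan[x]):
                uy[(u, y)] = uy.get((u, y), 0.0) + p * q
                xy[(x, y)] = xy.get((x, y), 0.0) + p * q
    return (mutual_information(uy), mutual_information(ux),
            mutual_information(xy))


if __name__ == "__main__":
    rng = random.Random(0)
    chan = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
    i_uy, i_ux, i_xy = chain_informations(0.4, (rng.random(), rng.random()), chan)
    print("I(U;Y) <= min{I(U;X), I(X;Y)}:", i_uy <= min(i_ux, i_xy) + 1e-12)

    # SS1: identical rows (a_i equal to a-bar_i) make X and Y independent.
    flat = [[0.2, 0.3, 0.5], [0.2, 0.3, 0.5]]
    print("SS1 informations:", chain_informations(0.4, (0.1, 0.8), flat))
```

With identical channel rows, both $I(U;Y)$ and $I(X;Y)$ vanish up to rounding, which is exactly the situation covered by SS1.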

3.2 Two Nontrivial Cases 

According to Remark 1, to prove inequality (2) it suffices to consider $X = f(U, V)$ with binary
$U$ and $V$. Notice that there are 16 possible functions $f$, and they can be classified into the following
equivalent (equivalence is due to relabeling) groups:

\[
\begin{aligned}
G_1 &: \; X = \{0\}, \; X = \{1\} \\
G_2 &: \; X = U, \; X = \bar{U}, \; X = V, \; X = \bar{V} \\
G_3 &: \; X = U \wedge V, \; X = \bar{U} \wedge V, \; X = U \wedge \bar{V}, \; X = \bar{U} \wedge \bar{V} \\
G_4 &: \; X = U \vee V, \; X = \bar{U} \vee V, \; X = U \vee \bar{V}, \; X = \bar{U} \vee \bar{V} \\
G_5 &: \; X = U \oplus V, \; X = \bar{U} \oplus V
\end{aligned}
\]

The reason that these are equivalent groups is that, in each group, all the cases can be reduced to
the first case by using some bijections. For example, in $G_3$, let the distributions of $(U, V)$ be $p(u, v)$
and $r(u, v)$ for $X = U \wedge V$ and $X = \bar{U} \wedge V$, respectively. The bijection is $p_{00} \leftrightarrow r_{10}$, $p_{01} \leftrightarrow r_{11}$,
$p_{10} \leftrightarrow r_{00}$, $p_{11} \leftrightarrow r_{01}$. Thus, we just need to prove Theorem 1 for the first function in each group.
Further, notice that for the case $X = U \vee V$ with distribution $q(u, v)$, by the bijection $p_{00} \leftrightarrow q_{11}$, $p_{01} \leftrightarrow q_{10}$,
$p_{10} \leftrightarrow q_{01}$, $p_{11} \leftrightarrow q_{00}$, we can use the same proof as for the case $X = U \wedge V$. That is, we use the fact that
$X = U \vee V \Leftrightarrow \bar{X} = \bar{U} \wedge \bar{V}$ to reduce the proof of the "or" case of one channel to the "and" case of
another broadcast channel obtained by flipping $U$, $V$, and $X$.

So it remains to consider the first cases of the groups other than $G_4$.

The first two cases are trivial. For $X = \{0\}$, the theorem reduces to $-I(U;V) \le 0$. For
$X = U$, i.e. $I(U;Y) = I(X;Y)$, the theorem follows from the data processing inequality, $I(V;Z) \le
I(V;X) = I(V;U)$ (see (5)). So finally we just need to consider the following two nontrivial
cases:

C3: $X = U \wedge V$

C5: $X = U \oplus V$
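The grouping of the 16 Boolean functions can be reproduced by brute force. The sketch below classifies each truth table by which inputs it depends on and by how many input pairs map to 1; this classification rule is an assumption chosen to match the groups $G_1$ to $G_5$ listed in the text, not a procedure taken from the paper.

```python
from collections import Counter
from itertools import product

# Input pairs (u, v) in the order 00, 01, 10, 11.
INPUTS = list(product((0, 1), repeat=2))


def classify(table):
    """Map a truth table (f(0,0), f(0,1), f(1,0), f(1,1)) to its group."""
    f = dict(zip(INPUTS, table))
    depends_on_u = any(f[(0, v)] != f[(1, v)] for v in (0, 1))
    depends_on_v = any(f[(u, 0)] != f[(u, 1)] for u in (0, 1))
    if not depends_on_u and not depends_on_v:
        return "G1"  # constants X = {0}, X = {1}
    if depends_on_u != depends_on_v:
        return "G2"  # X is U, V, or one of their complements
    ones = sum(table)
    if ones == 1:
        return "G3"  # "and" of possibly complemented inputs
    if ones == 3:
        return "G4"  # "or" of possibly complemented inputs
    return "G5"      # "xor" and its complement


def group_sizes():
    """Count how many of the 16 functions fall into each group."""
    return Counter(classify(t) for t in product((0, 1), repeat=4))


if __name__ == "__main__":
    for name, size in sorted(group_sizes().items()):
        print(name, size)
```

The counts come out as 2, 4, 4, 4 and 2, matching the five groups and accounting for all 16 functions.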

3.3 Proof for Case $X = U \wedge V$

In this case, $P(X = 1) = p_{11}$. If we fix $p_{11}$, the RHS remains unchanged for the given channels to $Y$ and $Z$. If
$p_{11}$ equals 0 (or 1), then $X = \{0\}$ (or $X = \{1\}$), and the problem reduces to the group $G_1$. So we just
need to consider $p_{11} \in (0, 1)$. Take $(p_{10}, p_{01})$ as the free variables, with $p_{00} = 1 - p_{11} - p_{01} - p_{10}$.
Thus the region of possible $(p_{10}, p_{01})$ is a right triangle together with its interior. The basic idea of
the proof is as follows:

1. We first prove Theorem 1 at the vertices of the region of $(p_{10}, p_{01})$. (Section 3.3.1)

2. Then we show that any nontrivial local maximum of LHS can only be at a vertex of the region of $(p_{10}, p_{01})$.
(Sections 3.3.2 and 3.3.3)

3.3.1 Case C3-1: at least two of $p_{00}, p_{01}, p_{10}$ equal 0

When this happens, the condition reduces to $\{X = U, V = 1\}$ or $\{X = V, U = 1\}$ or $\{U = V = X\}$,
which belong to group $G_2$, where Theorem 1 holds. Note that with $p_{11} < 1$, these three
probabilities cannot be zero simultaneously. However, for clarity, we still say "at least two" instead
of "exactly two".

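3.3.2 Case C3-2: exactly one of $p_{00}, p_{01}, p_{10}$ equals 0

For these cases, we show that a nontrivial local maximum does not exist. Consider a Lyapunov
perturbation $q(u, v, x) = p(u, v, x)[1 + \varepsilon L(u, v)]$, $\varepsilon \in \mathbb{R}$, that maintains $P(X = 0)$. This implies that the
perturbation satisfies

\[ L_{11} = 0, \qquad p_{00}L_{00} + p_{01}L_{01} + p_{10}L_{10} = 0. \tag{6} \]

For any valid perturbation, at any local maximum of LHS, the first and second derivatives w.r.t. $\varepsilon$
must be $= 0$ and $\le 0$, respectively. Thus

\[ H_L(U, V) = H_{E[L|U,Y]}(U, Y) + H_{E[L|V,Z]}(V, Z) \tag{7} \]

\[ E\big[E[L|U,V]^2\big] \ge E\big[E[L|U,Y]^2\big] + E\big[E[L|V,Z]^2\big] \tag{8} \]

where

\[
\begin{aligned}
H_L(U, V) &= -p_{00}L_{00}\log p_{00} - p_{01}L_{01}\log p_{01} - p_{10}L_{10}\log p_{10} \\
H_{E[L|U,Y]}(U, Y) &= -\sum_i a_i(p_{00}L_{00} + p_{01}L_{01})\log[a_i(p_{00} + p_{01})]
 - \sum_i a_i p_{10}L_{10}\log[a_i p_{10} + \bar{a}_i p_{11}] \\
H_{E[L|V,Z]}(V, Z) &= -\sum_i b_i(p_{00}L_{00} + p_{10}L_{10})\log[b_i(p_{00} + p_{10})]
 - \sum_i b_i p_{01}L_{01}\log[b_i p_{01} + \bar{b}_i p_{11}],
\end{aligned}
\]

and

\[
\begin{aligned}
E\big[E[L|U,V]^2\big] &= p_{00}L_{00}^2 + p_{01}L_{01}^2 + p_{10}L_{10}^2 \\
E\big[E[L|U,Y]^2\big] &= \frac{(p_{00}L_{00} + p_{01}L_{01})^2}{p_{00} + p_{01}} + \sum_i \frac{a_i^2 p_{10}^2 L_{10}^2}{a_i p_{10} + \bar{a}_i p_{11}} \\
E\big[E[L|V,Z]^2\big] &= \frac{(p_{00}L_{00} + p_{10}L_{10})^2}{p_{00} + p_{10}} + \sum_i \frac{b_i^2 p_{01}^2 L_{01}^2}{b_i p_{01} + \bar{b}_i p_{11}}.
\end{aligned}
\]

Case 1: $p_{00} = 0$; $p_{01}, p_{10}, p_{11} > 0$

In this case, condition (8) implies that the following inequality holds for all valid perturbations
satisfying (6):

\[
p_{01}L_{01}^2 + p_{10}L_{10}^2 \ge p_{01}L_{01}^2 + \sum_i \frac{a_i^2 p_{10}^2 L_{10}^2}{a_i p_{10} + \bar{a}_i p_{11}}
 + p_{10}L_{10}^2 + \sum_i \frac{b_i^2 p_{01}^2 L_{01}^2}{b_i p_{01} + \bar{b}_i p_{11}},
\]

that is,

\[
0 \ge \sum_i \frac{a_i^2 p_{10}^2 L_{10}^2}{a_i p_{10} + \bar{a}_i p_{11}} + \sum_i \frac{b_i^2 p_{01}^2 L_{01}^2}{b_i p_{01} + \bar{b}_i p_{11}}.
\]

However, when $p_{01}, p_{10}, p_{11} > 0$, this cannot hold for all valid perturbations: (6) with $p_{00} = 0$
allows any $L_{10} \ne 0$ (with $L_{01} = -p_{10}L_{10}/p_{01}$), which makes the right-hand side strictly positive.

Case 2: $p_{01} = 0$; $p_{00}, p_{10}, p_{11} > 0$

In this case, condition (7) implies that

\[
\begin{aligned}
&\log p_{10} - \log p_{00} = \sum_i a_i \log[a_i p_{10} + \bar{a}_i p_{11}] - \sum_i a_i \log[a_i p_{00}] \\
\Rightarrow\; &\sum_i a_i \log[a_i p_{10}] = \sum_i a_i \log[a_i p_{10} + \bar{a}_i p_{11}].
\end{aligned}
\]

This equality cannot hold since $a_i, \bar{a}_i, p_{10}, p_{11} > 0$ (see Remark 2).

Case 3: $p_{10} = 0$; $p_{00}, p_{01}, p_{11} > 0$

Just as in Case 2, condition (7) implies that

\[ \sum_i b_i \log[b_i p_{01}] = \sum_i b_i \log[b_i p_{01} + \bar{b}_i p_{11}]. \]

As before, this equality cannot hold since $b_i, \bar{b}_i, p_{01}, p_{11} > 0$.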
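3.3.3 Case C3-3: all $p_{uv} > 0$

As $P(X = 1) = p_{11}$ (equivalently $H(Y)$, $H(Z)$ via the Markov chain $(U, V) \to X \to (Y, Z)$) is kept
fixed, the local maxima of LHS are the same as those of

\[
\begin{aligned}
f(p_{10}, p_{01}) &= H(U, V) - H(U, Y) - H(V, Z) \\
&= -p_{00}\log p_{00} - p_{01}\log p_{01} - p_{10}\log p_{10} - p_{11}\log p_{11} \\
&\quad + \sum_i a_i(p_{00} + p_{01})\log[a_i(p_{00} + p_{01})] + \sum_i (a_i p_{10} + \bar{a}_i p_{11})\log[a_i p_{10} + \bar{a}_i p_{11}] \\
&\quad + \sum_i b_i(p_{00} + p_{10})\log[b_i(p_{00} + p_{10})] + \sum_i (b_i p_{01} + \bar{b}_i p_{11})\log[b_i p_{01} + \bar{b}_i p_{11}].
\end{aligned}
\]

At any local maximum, the gradient $\nabla f$ and Hessian matrix $\nabla^2 f$ must satisfy

\[ \nabla f = 0, \qquad \nabla^2 f \preceq 0, \tag{9} \]

where $\nabla^2 f \preceq 0$ denotes that $\nabla^2 f$ is negative semi-definite. We now compute the gradient and the
Hessian to investigate the locations of the local maxima.

1. First Derivative:

Differentiating w.r.t. the free variables we obtain:

\[
\begin{aligned}
\frac{\partial f}{\partial p_{10}} &= \log\frac{p_{00}}{p_{10}} - \sum_i a_i \log\frac{a_i(p_{00} + p_{01})}{a_i p_{10} + \bar{a}_i p_{11}} \\
\frac{\partial f}{\partial p_{01}} &= \log\frac{p_{00}}{p_{01}} - \sum_i b_i \log\frac{b_i(p_{00} + p_{10})}{b_i p_{01} + \bar{b}_i p_{11}}.
\end{aligned}
\]

The condition $\nabla f = 0$ implies that

\[ \log\frac{p_{00}}{p_{10}} = \sum_i a_i \log\frac{a_i(p_{00} + p_{01})}{a_i p_{10} + \bar{a}_i p_{11}} \tag{10} \]

\[ \log\frac{p_{00}}{p_{01}} = \sum_i b_i \log\frac{b_i(p_{00} + p_{10})}{b_i p_{01} + \bar{b}_i p_{11}}. \tag{11} \]

Using the concavity of the logarithm, we have

\[
\frac{p_{00}}{p_{10}} \le \sum_i \frac{a_i^2(p_{00} + p_{01})}{a_i p_{10} + \bar{a}_i p_{11}}, \qquad
\frac{p_{00}}{p_{01}} \le \sum_i \frac{b_i^2(p_{00} + p_{10})}{b_i p_{01} + \bar{b}_i p_{11}}, \tag{12}
\]

where the equalities hold iff (using Remark 2)

\[ \bar{a}_i = c_a a_i, \qquad \bar{b}_i = c_b b_i \]

for some constants $c_a$, $c_b$ respectively.

However, since $\sum_i a_i = \sum_i \bar{a}_i = 1$ we obtain that $c_a = 1$ (similarly $c_b = 1$). Thus the equalities hold
iff

\[ a_i = \bar{a}_i, \qquad b_i = \bar{b}_i. \tag{13} \]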
2. Second Derivative:

We now compute the Hessian $G = \nabla^2 f$. The second derivatives are

\[
\begin{aligned}
G_{11} &= \frac{\partial^2 f}{\partial p_{10}^2} = -\frac{1}{p_{00}} - \frac{1}{p_{10}} + \frac{1}{p_{00} + p_{01}} + \sum_i \frac{a_i^2}{a_i p_{10} + \bar{a}_i p_{11}} \\
G_{12} &= G_{21} = -\frac{1}{p_{00}} \\
G_{22} &= \frac{\partial^2 f}{\partial p_{01}^2} = -\frac{1}{p_{00}} - \frac{1}{p_{01}} + \frac{1}{p_{00} + p_{10}} + \sum_i \frac{b_i^2}{b_i p_{01} + \bar{b}_i p_{11}}.
\end{aligned}
\]

As $p_{01} > 0$, we have $G_{11} < -\frac{1}{p_{00}} - \frac{1}{p_{10}} + \frac{1}{p_{00} + p_{01}} + \frac{1}{p_{10}} < 0$. Similarly we have $G_{22} < 0$. For $G$ with
$G_{11} < 0$ and $G_{22} < 0$ to be negative semi-definite, it is necessary and sufficient that $\det(G) \ge 0$.
From (10) and (12) we have

\[
G_{11} \ge -\frac{1}{p_{00}} - \frac{1}{p_{10}} + \frac{1}{p_{00} + p_{01}} + \frac{p_{00}}{p_{10}(p_{00} + p_{01})}
 = -\frac{p_{01}(p_{00} + p_{10})}{p_{00}\,p_{10}\,(p_{00} + p_{01})}.
\]

Similarly from (11) and (12) we have

\[ G_{22} \ge -\frac{p_{10}(p_{00} + p_{01})}{p_{00}\,p_{01}\,(p_{00} + p_{10})}. \]

It is clear that equalities in the above two inequalities hold iff (13) holds.
Since $G_{11}, G_{22} < 0$ we have

\[
G_{11}G_{22} \le \frac{p_{01}(p_{00} + p_{10})}{p_{00}\,p_{10}\,(p_{00} + p_{01})} \cdot \frac{p_{10}(p_{00} + p_{01})}{p_{00}\,p_{01}\,(p_{00} + p_{10})} = \frac{1}{p_{00}^2} = G_{12}^2,
\]

with equality holding only if (13) holds.

Thus $\det(G) < 0$, i.e. there is no local maximum when all $p_{uv} > 0$, unless the channel parameters
satisfy (13). However, when (13) holds, we know that the inequality is true as it corresponds to the
special setting SS1.

This completes the argument that the inequality is indeed true when $X = U \wedge V$, as we have
already shown the validity of the inequality at the vertices of the region defined by $(p_{10}, p_{01})$, the
possible locations of the local maxima of the LHS.
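The closed-form Hessian entries used in Case C3-3 can be sanity-checked against finite differences of $f(p_{10}, p_{01})$. The sketch below is an illustration under assumed example parameters (the probabilities and channel vectors in the `__main__` section are arbitrary choices, and natural logarithms are used so that derivatives take the $1/p$ form shown in the text).

```python
import math


def xlogx(t):
    return t * math.log(t)


def f(p10, p01, p11, a, a_bar, b, b_bar):
    """f = H(U,V) - H(U,Y) - H(V,Z) for X = U and V, with p11 held fixed."""
    p00 = 1.0 - p11 - p01 - p10
    value = -(xlogx(p00) + xlogx(p01) + xlogx(p10) + xlogx(p11))
    for ai, abi in zip(a, a_bar):
        value += xlogx(ai * (p00 + p01)) + xlogx(ai * p10 + abi * p11)
    for bi, bbi in zip(b, b_bar):
        value += xlogx(bi * (p00 + p10)) + xlogx(bi * p01 + bbi * p11)
    return value


def hessian_closed_form(p10, p01, p11, a, a_bar, b, b_bar):
    """(G11, G12, G22) from the formulas in the text."""
    p00 = 1.0 - p11 - p01 - p10
    g11 = (-1.0 / p00 - 1.0 / p10 + 1.0 / (p00 + p01)
           + sum(ai * ai / (ai * p10 + abi * p11) for ai, abi in zip(a, a_bar)))
    g22 = (-1.0 / p00 - 1.0 / p01 + 1.0 / (p00 + p10)
           + sum(bi * bi / (bi * p01 + bbi * p11) for bi, bbi in zip(b, b_bar)))
    return g11, -1.0 / p00, g22


def hessian_numeric(p10, p01, p11, a, a_bar, b, b_bar, h=1e-4):
    """(G11, G12, G22) from central finite differences of f."""
    def g(x, y):
        return f(x, y, p11, a, a_bar, b, b_bar)
    g11 = (g(p10 + h, p01) - 2.0 * g(p10, p01) + g(p10 - h, p01)) / (h * h)
    g22 = (g(p10, p01 + h) - 2.0 * g(p10, p01) + g(p10, p01 - h)) / (h * h)
    g12 = (g(p10 + h, p01 + h) - g(p10 + h, p01 - h)
           - g(p10 - h, p01 + h) + g(p10 - h, p01 - h)) / (4.0 * h * h)
    return g11, g12, g22


if __name__ == "__main__":
    params = dict(p10=0.2, p01=0.3, p11=0.25,
                  a=[0.5, 0.3, 0.2], a_bar=[0.2, 0.2, 0.6],
                  b=[0.6, 0.4], b_bar=[0.1, 0.9])
    print("closed form:", hessian_closed_form(**params))
    print("numeric:    ", hessian_numeric(**params))
```

The two triples should agree to several decimal places. Note that the bound $\det(G) < 0$ in the text additionally uses the stationarity conditions (10) and (11), so it is only claimed at critical points, not at an arbitrary point such as the one used here.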

3.4 Proof for Case $X = U \oplus V$

Similar to the "and" case, we will show that a nontrivial local maximum cannot be achieved when all
$p_{uv} > 0$. And when at least one of the $p_{uv}$ equals zero, the problem reduces to the case $X = U \wedge V$.

3.4.1 Case C5-1: at least one of $p_{uv}$ equals 0

This case can be reduced to the group $G_3$ or $G_4$, and further reduced to the case $X = U \wedge V$. For
example, if $p_{01} = 0$, $X = U \oplus V$ is a special case of $X = U \wedge \bar{V}$.
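3.4.2 Case C5-2: all $p_{uv} > 0$

Just as in [7] we will consider a more general perturbation (see [7] for the motivation).

Consider a perturbation $q(u, v, x) = p(u, v, x) + \varepsilon\lambda(u, v, x)$ for some $\varepsilon > 0$. For a valid
perturbation, we require that $\lambda_{001}, \lambda_{010}, \lambda_{100}, \lambda_{111} \ge 0$, as the corresponding $p(u, v, x)$ are zero. Further
let us require that the perturbation maintains $P(X = 0)$, that is

\[
\begin{aligned}
\lambda_{000} + \lambda_{010} + \lambda_{100} + \lambda_{110} &= 0 \\
\lambda_{001} + \lambda_{011} + \lambda_{101} + \lambda_{111} &= 0.
\end{aligned}
\tag{14}
\]

For any perturbation that satisfies the above conditions, at any local maximum it must be true
that the first derivative cannot be positive. This implies that

\[ H_\lambda(U, V) - H_{E[\lambda|U,Y]}(U, Y) - H_{E[\lambda|V,Z]}(V, Z) \le 0 \tag{15} \]

where

\[
\begin{aligned}
H_\lambda(U, V) &= -(\lambda_{000} + \lambda_{001})\log p_{00} - (\lambda_{010} + \lambda_{011})\log p_{01} \\
&\quad - (\lambda_{100} + \lambda_{101})\log p_{10} - (\lambda_{110} + \lambda_{111})\log p_{11} \\
H_{E[\lambda|U,Y]}(U, Y) &= -\sum_i [a_i(\lambda_{000} + \lambda_{010}) + \bar{a}_i(\lambda_{001} + \lambda_{011})]\log[a_i p_{00} + \bar{a}_i p_{01}] \\
&\quad - \sum_i [a_i(\lambda_{100} + \lambda_{110}) + \bar{a}_i(\lambda_{101} + \lambda_{111})]\log[a_i p_{11} + \bar{a}_i p_{10}] \\
H_{E[\lambda|V,Z]}(V, Z) &= -\sum_i [b_i(\lambda_{000} + \lambda_{100}) + \bar{b}_i(\lambda_{001} + \lambda_{101})]\log[b_i p_{00} + \bar{b}_i p_{10}] \\
&\quad - \sum_i [b_i(\lambda_{010} + \lambda_{110}) + \bar{b}_i(\lambda_{011} + \lambda_{111})]\log[b_i p_{11} + \bar{b}_i p_{01}].
\end{aligned}
\]

From (14), we express $\lambda_{000}$ and $\lambda_{011}$ in terms of the other $\lambda(u, v, x)$ variables, that is

\[
\begin{aligned}
\lambda_{000} &= -\lambda_{010} - \lambda_{100} - \lambda_{110}, \\
\lambda_{011} &= -\lambda_{001} - \lambda_{101} - \lambda_{111}.
\end{aligned}
\]

Substituting the above equations into (15), we have

\[
\begin{aligned}
&(\lambda_{010} + \lambda_{100} + \lambda_{110} - \lambda_{001})\log p_{00} - (\lambda_{100} + \lambda_{101})\log p_{10} \\
&+ (\lambda_{001} + \lambda_{101} + \lambda_{111} - \lambda_{010})\log p_{01} - (\lambda_{110} + \lambda_{111})\log p_{11} \\
&- \sum_i [a_i(\lambda_{100} + \lambda_{110}) + \bar{a}_i(\lambda_{101} + \lambda_{111})]\log\frac{a_i p_{00} + \bar{a}_i p_{01}}{a_i p_{11} + \bar{a}_i p_{10}} \\
&- \sum_i [b_i(\lambda_{010} + \lambda_{110}) - \bar{b}_i(\lambda_{001} + \lambda_{101})]\log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}} \le 0.
\end{aligned}
\tag{16}
\]

Since (16) holds for any $\lambda_{110}$ and any nonnegative $\lambda_{010}$, $\lambda_{100}$, it implies that

\[
\log\frac{p_{00}}{p_{11}} = \sum_i a_i \log\frac{a_i p_{00} + \bar{a}_i p_{01}}{a_i p_{11} + \bar{a}_i p_{10}} + \sum_i b_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}}
\]

\[ \log\frac{p_{00}}{p_{01}} \le \sum_i b_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}} \tag{17} \]

\[ \log\frac{p_{00}}{p_{10}} \le \sum_i a_i \log\frac{a_i p_{00} + \bar{a}_i p_{01}}{a_i p_{11} + \bar{a}_i p_{10}}. \tag{18} \]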
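These implications come from computing the coefficients of $\lambda_{110}$, $\lambda_{010}$, and $\lambda_{100}$. The above three
relations lead to

\[
\log\frac{p_{00}^2}{p_{01}p_{10}} \le \log\frac{p_{00}}{p_{11}}
\;\Rightarrow\; p_{00}p_{11} \le p_{01}p_{10}. \tag{19}
\]

Similarly, since inequality (16) also holds for any $\lambda_{101}$ and any nonnegative $\lambda_{001}$, $\lambda_{111}$, we obtain
that

\[
\log\frac{p_{01}}{p_{10}} = \sum_i \bar{a}_i \log\frac{a_i p_{00} + \bar{a}_i p_{01}}{a_i p_{11} + \bar{a}_i p_{10}} - \sum_i \bar{b}_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}}
\]

\[ \log\frac{p_{01}}{p_{00}} \le -\sum_i \bar{b}_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}} \tag{20} \]

\[ \log\frac{p_{01}}{p_{11}} \le \sum_i \bar{a}_i \log\frac{a_i p_{00} + \bar{a}_i p_{01}}{a_i p_{11} + \bar{a}_i p_{10}}. \tag{21} \]

The above three relations lead to

\[
\log\frac{p_{01}^2}{p_{00}p_{11}} \le \log\frac{p_{01}}{p_{10}}
\;\Rightarrow\; p_{00}p_{11} \ge p_{01}p_{10}. \tag{22}
\]

Combining (19) and (22) we obtain that

\[ p_{00}p_{11} = p_{01}p_{10}. \tag{23} \]

This equality means that equality holds in (17), (18), (20), and (21).
In particular, the equalities in (17) and (20) imply that

\[
\log\frac{p_{00}}{p_{01}} = \sum_i b_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}} = \sum_i \bar{b}_i \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}}.
\]

Taking a weighted sum, we get

\[
(p_{00} + p_{10})\log\frac{p_{00}}{p_{01}} = \sum_i (b_i p_{00} + \bar{b}_i p_{10})\log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}}. \tag{24}
\]

From the above and using the non-negativity of K-L divergence, we have

\[
\log\frac{p_{00}}{p_{01}} = \sum_i \frac{b_i p_{00} + \bar{b}_i p_{10}}{p_{00} + p_{10}} \log\frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}}
\ge \log\frac{p_{00} + p_{10}}{p_{11} + p_{01}} = \log\frac{p_{00}}{p_{01}}.
\]

Notice that the last equality holds since $p_{00}p_{11} = p_{01}p_{10}$. Since the K-L divergence inequality is indeed
an equality, we require that

\[ \frac{b_i p_{00} + \bar{b}_i p_{10}}{b_i p_{11} + \bar{b}_i p_{01}} = \frac{p_{00}}{p_{01}} \quad \text{for all } i. \]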
From the above (using (23)) we obtain

\[ (p_{01} - p_{11})(b_i - \bar{b}_i) = 0. \tag{25} \]

Similarly, using the fact that we have equalities in (18) and (21), we can obtain

\[ (p_{10} - p_{11})(a_i - \bar{a}_i) = 0. \tag{26} \]

Now we have two cases:

1. $b_i = \bar{b}_i$ for all $i$, or $a_i = \bar{a}_i$ for all $i$. In this case the Theorem holds (special setting SS1).

2. $p_{01} = p_{11}$, $p_{10} = p_{11}$. Combining this with $p_{00}p_{11} = p_{01}p_{10}$ (Eqn. (23)) one obtains that
$p_{uv} = 1/4$, and as a result $U$, $V$ and $X$ are pairwise independent; in particular $U \perp X$. The Theorem holds
(special setting SS2).

If neither of these two cases is satisfied, there is no local maximum with all $p_{uv} > 0$. This shows
that the inequality indeed holds when $X = U \oplus V$. This completes the proof of Theorem 1.
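The second case above can be checked directly. The following sketch (function and key names are illustrative assumptions) builds the uniform distribution $p_{uv} = 1/4$ with $X = U \oplus V$ and computes the relevant mutual informations: each pair among $U$, $V$, $X$ is independent, which is what special setting SS2 needs, even though $X$ is a deterministic function of the pair $(U, V)$.

```python
import math
from itertools import product


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def xor_uniform_informations():
    """Mutual informations for p_uv = 1/4 and X = U xor V."""
    p_uv = {(u, v): 0.25 for u, v in product((0, 1), repeat=2)}
    ux, vx, uvx = {}, {}, {}
    for (u, v), q in p_uv.items():
        x = u ^ v
        ux[(u, x)] = ux.get((u, x), 0.0) + q
        vx[(v, x)] = vx.get((v, x), 0.0) + q
        uvx[((u, v), x)] = q
    return {
        "I(U;V)": mutual_information(p_uv),
        "I(U;X)": mutual_information(ux),
        "I(V;X)": mutual_information(vx),
        "I(U,V;X)": mutual_information(uvx),
    }


if __name__ == "__main__":
    for name, value in xor_uniform_informations().items():
        print(name, "=", value)
```

Since $I(U;X) = 0$, the data processing inequality gives $I(U;Y) = 0$, and the theorem follows as in SS2.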

4 Conclusion 

An information theoretic inequality is established for binary input broadcast channels. This can be 
used to show that the sum-rate given by Marton's inner bound is indeed equivalent to that given 
by randomized time-division strategy. 

The proof technique is directly motivated by [7] and generalizes the result there. Clearly the
inequality fails when $|\mathcal{X}| \ge 3$ (for instance, for the Blackwell channel), so a natural question is whether
there is a correct generalization for higher-cardinality input alphabets.
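The failure for a ternary input can be seen in a few lines. The sketch below assumes the standard form of the Blackwell channel (a deterministic map from $x \in \{0, 1, 2\}$ to the output pair $(y, z)$, which this paper does not spell out), a uniform input, and the choice $U = Y$, $V = Z$; these choices are for illustration only.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


# Assumed standard Blackwell channel: deterministic map x -> (y, z).
BLACKWELL = {0: (0, 0), 1: (0, 1), 2: (1, 1)}


def blackwell_sides(px=(1 / 3, 1 / 3, 1 / 3)):
    """Return (LHS, RHS) of the inequality with the choice U = Y, V = Z."""
    uy, vz, uv, xy, xz = {}, {}, {}, {}, {}
    for x, p in enumerate(px):
        y, z = BLACKWELL[x]
        u, v = y, z  # U and V are functions of X, so (U,V) -> X -> (Y,Z) holds
        for table, key in ((uy, (u, y)), (vz, (v, z)), (uv, (u, v)),
                           (xy, (x, y)), (xz, (x, z))):
            table[key] = table.get(key, 0.0) + p
    lhs = (mutual_information(uy) + mutual_information(vz)
           - mutual_information(uv))
    rhs = max(mutual_information(xy), mutual_information(xz))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = blackwell_sides()
    print("LHS =", lhs, " RHS =", rhs)
```

Here LHS equals $H(Y, Z) = \log_2 3$ bits, while RHS is the binary entropy of $1/3$, which is below 1 bit, so the inequality is violated.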

It would also be useful to find a more intuitive (geometric) argument to shed more light into 
the actual counting of the sizes of typical sets. Here is an equivalent formulation which is related 
to the sizes of certain typical sets. It can be shown that the information inequality is equivalent to 
showing that 

\[ H(U|Y) + H(V|Z) \ge \min\{H(UV|Y), H(UV|Z)\} \]

whenever $(U, V) \to X \to (Y, Z)$ forms a Markov chain, $X = f(U, V)$ and $|\mathcal{X}| = 2$.
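The equivalence between the two formulations can be illustrated numerically: when $X = f(U, V)$, the slack in the mutual-information inequality equals the slack in the conditional-entropy inequality. The sketch below, with assumed helper names and randomly drawn parameters, computes both slacks from the same joint distribution.

```python
import math
import random


def entropy(dist):
    """Shannon entropy in bits of a dict {outcome: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, idx):
    """Marginalise a dict keyed by tuples onto the coordinates in idx."""
    out = {}
    for key, p in joint.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, 0.0) + p
    return out


def both_gaps(p_uv, f, chan_y, chan_z):
    """Return (information gap, entropy gap) for X = f(U, V)."""
    U, V, X, O = 0, 1, 2, 3  # coordinates of (u, v, x, output)
    jy, jz = {}, {}
    for (u, v), p in p_uv.items():
        x = f(u, v)
        for y, q in enumerate(chan_y[x]):
            jy[(u, v, x, y)] = p * q
        for z, q in enumerate(chan_z[x]):
            jz[(u, v, x, z)] = p * q

    def h(joint, idx):
        return entropy(marginal(joint, idx))

    def mi(joint, a, b):
        return h(joint, a) + h(joint, b) - h(joint, a + b)

    def cond(joint, a, b):
        return h(joint, a + b) - h(joint, b)

    lhs = mi(jy, (U,), (O,)) + mi(jz, (V,), (O,)) - mi(jy, (U,), (V,))
    rhs = max(mi(jy, (X,), (O,)), mi(jz, (X,), (O,)))
    entropy_gap = (cond(jy, (U,), (O,)) + cond(jz, (V,), (O,))
                   - min(cond(jy, (U, V), (O,)), cond(jz, (U, V), (O,))))
    return rhs - lhs, entropy_gap


def random_dist(n, rng):
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]


if __name__ == "__main__":
    rng = random.Random(0)
    probs = random_dist(4, rng)
    p_uv = {(u, v): probs[2 * u + v] for u in (0, 1) for v in (0, 1)}
    chan_y = [random_dist(3, rng) for _ in range(2)]
    chan_z = [random_dist(3, rng) for _ in range(2)]
    print(both_gaps(p_uv, lambda u, v: u ^ v, chan_y, chan_z))
```

The two returned numbers agree up to rounding, and both are non-negative for binary $X$, as the theorem asserts.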

Acknowledgements

The guess that the inequality (Theorem 1) may hold in this generality was primarily motivated
by another problem that the authors were working on with Shlomo Shamai. Indeed, the original guess
of the authors was that this inequality may hold for binary-input output-symmetric broadcast
channels. When the proof of this materialized, the authors realized that they had not used the fact
that the outputs needed to be symmetric. Therefore the authors would like to express their thanks
to Shlomo Shamai for his part in their work on binary-input output-symmetric broadcast channels.

The authors are also grateful to Raymond Yeung for his insightful comments about the
relationship of this inequality to $\Gamma_5^*$.






References 

[1] T. Cover, "Broadcast channels," IEEE Trans. Info. Theory, vol. IT-18, pp. 2-14, January, 1972.

[2] T. Cover, "Comments on broadcast channels," IEEE Trans. Info. Theory, vol. IT-44, pp. 2524-2530,
October, 1998.

[3] K. Marton, "A coding theorem for the discrete memoryless broadcast channel," IEEE Trans.
Info. Theory, vol. IT-25, pp. 306-311, May, 1979.

[4] A. A. Gohari and V. Anantharam, "Evaluation of Marton's inner bound for the general
broadcast channel," CoRR, vol. abs/0904.4541, 2009.

[5] C. Nair and A. El Gamal, "An outer bound to the capacity region of the broadcast channel,"
IEEE Trans. Info. Theory, vol. IT-53, pp. 350-355, January, 2007.

[6] C. Nair and Z. V. Wang, "On the inner and outer bounds for 2-receiver discrete memoryless
broadcast channels," Proceedings of the ITA Workshop, 2008.

[7] V. Jog and C. Nair, "An information inequality for the BSSC channel," 2009. [Online]. Available:
http://www.citebase.org/abstract?id=oai:arXiv.org:0901.1492






